Emotion recognition based on phoneme classes
Authors
Abstract
Recognizing human emotions and attitudes from speech cues has gained increased attention recently. Most previous work has focused primarily on suprasegmental prosodic features calculated at the utterance level, rather than on details at the segmental phoneme level. Based on the hypothesis that different emotions have varying effects on the properties of different speech sounds, this paper investigates the usefulness of phoneme-level modeling for the classification of emotional states from speech. Hidden Markov models (HMMs) based on short-term spectral features are used for this purpose, with data obtained from a recording of an actress expressing 4 different emotional states: anger, happiness, neutral, and sadness. We designed and compared two sets of HMM classifiers: a generic set of “emotional speech” HMMs (one for each emotion) and a set of broad phonetic-class based HMMs for each emotion type considered. Five broad phonetic classes were used to explore the effect of emotional coloring on different phoneme classes, and the spectral properties of vowel sounds were found to be the best indicator of emotion in terms of classification performance. The experiments also showed that phoneme-class classifiers perform better than both the generic “emotional” HMM classifier and classifiers based on global prosodic features. To examine the complementary effect of the prosodic and spectral features, the two classifiers were combined at the decision level. The improvement was 0.55% absolute (0.7% relative) compared with the result from the phoneme-class based HMM classifier.
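The abstract does not say which rule was used to combine the two classifiers at the decision level. As a rough illustration only, the sketch below assumes one common choice: each classifier's per-emotion log scores are softmax-normalized and then mixed with a weighted sum. The function names, the `weight` parameter, and the example scores are all hypothetical and are not taken from the paper.

```python
import math

EMOTIONS = ["anger", "happiness", "neutral", "sadness"]

def to_posteriors(log_scores):
    """Softmax-normalize per-emotion log scores so they sum to 1."""
    peak = max(log_scores.values())  # subtract the max for numerical stability
    exp_scores = {e: math.exp(s - peak) for e, s in log_scores.items()}
    total = sum(exp_scores.values())
    return {e: v / total for e, v in exp_scores.items()}

def fuse_decisions(spectral_log_scores, prosodic_log_scores, weight=0.5):
    """Assumed fusion rule: weighted sum of the two normalized score sets.

    `weight` is the share given to the spectral (HMM) classifier; the
    prosodic classifier gets the remainder. Returns (label, fused_scores).
    """
    p_spec = to_posteriors(spectral_log_scores)
    p_pros = to_posteriors(prosodic_log_scores)
    fused = {e: weight * p_spec[e] + (1.0 - weight) * p_pros[e] for e in EMOTIONS}
    return max(fused, key=fused.get), fused

# Made-up scores for one utterance, for illustration only.
spectral = {"anger": -10.0, "happiness": -12.0, "neutral": -15.0, "sadness": -20.0}
prosodic = {"anger": -3.0, "happiness": -1.0, "neutral": -4.0, "sadness": -6.0}

label, fused = fuse_decisions(spectral, prosodic, weight=0.7)
print(label, fused)
```

With these made-up scores the two classifiers disagree, so the fused label depends on `weight`; in practice such a weight would be tuned on held-out data.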
Similar resources
Automated vocal emotion recognition using phoneme class specific features
Methods for automated vocal emotion recognition often use acoustic feature vectors that are computed for each frame in an utterance, and global statistics based on these acoustic feature vectors. However, at least two considerations argue for usage of phoneme class specific features for emotion recognition. First, there are well-known effects of phoneme class on some of these features. Second, ...
Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM
Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research shows that using deep neural networks (DNNs) in speech recognition systems significantly improves the performance of these systems. DNN-based phoneme recognition systems have two phases: training and testing. Mos...
Allophone-based acoustic modeling for Persian phoneme recognition
Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation, which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighboring phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...
Class-level spectral features for emotion recognition
The most common approaches to automatic emotion recognition rely on utterance level prosodic features. Recent studies have shown that utterance level statistics of segmental spectral features also contain rich information about expressivity and emotion. In our work we introduce a more fine-grained yet robust set of spectral features: statistics of Mel-Frequency Cepstral Coefficients computed ov...
Modeling phonetic pattern variability in favor of the creation of robust emotion classifiers for real-life applications
The role of automatic emotion recognition from speech is growing continuously because of the accepted importance of reacting to the emotional state of the user in human–computer interaction. Most state-of-the-art emotion recognition methods are based on turn- and frame-level analysis independent from phonetic transcription. Here, we are interested in a phoneme-based classification of the level of ar...